Neighbor Embedding for High-Dimensional Sparse Poisson Data

Mudrik, Noga, Charles, Adam S.

arXiv.org Machine Learning

Across many scientific fields, measurements often represent the number of times an event occurs. For example, a document can be represented by word occurrence counts, neural activity by spike counts per time window, or online communication by daily email counts. These measurements yield high-dimensional count data that often approximate a Poisson distribution, frequently with low rates that produce substantial sparsity and complicate downstream analysis. A useful approach is to embed the data into a low-dimensional space that preserves meaningful structure, commonly termed dimensionality reduction. Yet existing dimensionality reduction methods, including both linear (e.g., PCA) and nonlinear approaches (e.g., t-SNE), often assume continuous Euclidean geometry, thereby misaligning with the discrete, sparse nature of low-rate count data. Here, we propose p-SNE (Poisson Stochastic Neighbor Embedding), a nonlinear neighbor embedding method designed around the Poisson structure of count data, using KL divergence between Poisson distributions to measure pairwise dissimilarity and Hellinger distance to optimize the embedding. We test p-SNE on synthetic Poisson data and demonstrate its ability to recover meaningful structure in real-world count datasets, including weekday patterns in email communication, research area clusters in OpenReview papers, and temporal drift and stimulus gradients in neural spike recordings.
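
Both ingredients the abstract names have closed forms for Poisson rates, which keeps the pairwise computations cheap. Below is a minimal NumPy sketch of those closed forms; treating each count vector as per-feature rate estimates, summing KL across features, and flooring rates at a small epsilon to handle zero counts are my assumptions, not details confirmed by the abstract:

```python
import numpy as np

def poisson_kl(lam_p, lam_q, eps=1e-12):
    """KL(Pois(lam_p) || Pois(lam_q)) in closed form; note KL is asymmetric."""
    lam_p = np.maximum(lam_p, eps)   # epsilon floor for zero counts (assumption)
    lam_q = np.maximum(lam_q, eps)
    return lam_q - lam_p + lam_p * np.log(lam_p / lam_q)

def poisson_hellinger_sq(lam_p, lam_q):
    """Squared Hellinger distance between two Poisson distributions."""
    return 1.0 - np.exp(-0.5 * (np.sqrt(lam_p) - np.sqrt(lam_q)) ** 2)

def pairwise_kl(X, eps=1e-12):
    """Per-feature Poisson KL summed over features, for every pair of rows.
    X: (n_samples, n_features) nonnegative counts used as rate estimates."""
    n = X.shape[0]
    D = np.empty((n, n))
    for i in range(n):
        D[i] = poisson_kl(X[i][None, :], X, eps).sum(axis=1)
    return D
```

One plausible reason to prefer the Hellinger form on the embedding side is that it is bounded in [0, 1], whereas the Poisson KL is unbounded as rates diverge.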


Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition

Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

Neural Information Processing Systems

In this paper, we consider a multi-step version of the stochastic ADMM method with efficient guarantees for high-dimensional problems. We first analyze the simple setting, where the optimization problem consists of a loss function and a single regularizer (e.g.
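
The single-regularizer setting is the standard ADMM splitting min_x f(x) + g(z) subject to x = z. Here is a sketch with g = lam * ||.||_1, using a linearized stochastic x-update; the decaying step size and the linearization are generic textbook choices, not the paper's multi-step scheme:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def stochastic_admm_l1(grad_fn, dim, lam=0.1, rho=1.0, steps=1000, seed=0):
    """Sketch of stochastic ADMM for min_x E[f(x)] + lam * ||x||_1,
    split as f(x) + lam * ||z||_1 with the constraint x = z.
    grad_fn(x, rng) returns a stochastic gradient of f at x (assumed interface)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim); z = np.zeros(dim); u = np.zeros(dim)
    for k in range(1, steps + 1):
        eta = 1.0 / np.sqrt(k)          # decaying step size (generic choice)
        g = grad_fn(x, rng)
        # x-update: linearize f around the current x, solve the quadratic exactly
        x = (rho * (z - u) + x / eta - g) / (rho + 1.0 / eta)
        # z-update: prox of the l1 regularizer
        z = soft_threshold(x + u, lam / rho)
        # dual update
        u = u + x - z
    return z
```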


Navigating Extremes: Dynamic Sparsity in Large Output Spaces

Neural Information Processing Systems

In recent years, Dynamic Sparse Training (DST) has emerged as an alternative to post-training pruning for generating efficient models. In principle, DST allows for a more memory efficient training process, as it maintains sparsity throughout the entire training run. However, current DST implementations fail to capitalize on this in practice. Because sparse matrix multiplication is much less efficient than dense matrix multiplication on GPUs, most implementations simulate sparsity by masking weights.
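
The masking point is easy to make concrete: the "sparse" layer still pays for a dense matmul. A small NumPy/SciPy illustration, with arbitrary sizes and a 10% density chosen for the example:

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024)).astype(np.float32)
mask = rng.random(W.shape) < 0.1            # keep ~10% of the weights
x = rng.standard_normal((1024, 8)).astype(np.float32)

# How most DST implementations realize sparsity: a dense matmul over masked
# weights. FLOPs and memory are those of the dense layer; the zeros are
# still multiplied and accumulated.
y_masked = (W * mask) @ x

# What the sparsity could buy with a genuine sparse kernel (CPU illustration).
W_sparse = csr_matrix(W * mask)
y_sparse = W_sparse @ x

assert np.allclose(y_masked, y_sparse, atol=1e-3)
```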


Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?

Neural Information Processing Systems

Although PaI methods manage to find trainable subnetworks that outperform random pruning, their performance in terms of both accuracy and computational reduction is far from satisfactory compared to post-training pruning, and a clear understanding of PaI is still missing.
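
For context on what a "sparse mask" is here: PaI methods score weights at initialization and keep the top-scoring fraction. A toy sketch of two baselines the abstract alludes to, magnitude-at-initialization versus random pruning; the layer shape and density are arbitrary illustration values:

```python
import numpy as np

def topk_mask(scores, density):
    """Keep the `density` fraction of entries with the highest scores."""
    k = max(1, int(density * scores.size))
    thresh = np.partition(scores.ravel(), -k)[-k]
    return (scores >= thresh).astype(np.float32)

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))         # weights at initialization

magnitude_mask = topk_mask(np.abs(W), 0.05)       # data-agnostic: |w| at init
random_mask = topk_mask(rng.random(W.shape), 0.05)  # random-pruning baseline
```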


Exploring the impact of adaptive rewiring in Graph Neural Networks

van Nooten, Charlotte Cambier, Aronis, Christos, Shapovalova, Yuliya, Cavallaro, Lucia

arXiv.org Machine Learning

This paper explores sparsification methods as a form of regularization in Graph Neural Networks (GNNs) to address high memory usage and computational costs in large-scale graph applications. Using techniques from Network Science and Machine Learning, including Erdős-Rényi for model sparsification, we enhance the efficiency of GNNs for real-world applications. We demonstrate our approach on N-1 contingency assessment in electrical grids, a critical task for ensuring grid reliability. We apply our methods to three datasets of varying sizes, exploring Graph Convolutional Networks (GCN) and Graph Isomorphism Networks (GIN) with different degrees of sparsification and rewiring. Comparison across sparsification levels shows the potential of combining insights from both research fields to improve GNN performance and scalability. Our experiments highlight the importance of tuning sparsity parameters: while sparsity can improve generalization, excessive sparsity may hinder learning of complex patterns. Our adaptive rewiring approach, particularly when combined with early stopping, proves promising by allowing the model to adapt its connectivity structure during training. This research contributes to understanding how sparsity can be effectively leveraged in GNNs for critical applications like power grid reliability analysis.
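
The Erdős-Rényi ingredient can be read as independent random edge retention. A small NumPy sketch of that step alone, with a uniform keep probability; the paper's adaptive rewiring schedule and early-stopping interaction are not reproduced here:

```python
import numpy as np

def er_sparsify(adj, keep_prob, rng):
    """Erdos-Renyi-style sparsification: keep each undirected edge
    independently with probability keep_prob."""
    upper = np.triu(adj, k=1)                # one copy of each edge
    coin = rng.random(upper.shape) < keep_prob
    kept = upper * coin
    return kept + kept.T                     # re-symmetrize

rng = np.random.default_rng(0)
A = (rng.random((100, 100)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T               # random undirected toy graph
A_sparse = er_sparsify(A, keep_prob=0.5, rng=rng)
```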


ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models

Neural Information Processing Systems

One-shot pruning techniques offer a way to alleviate these burdens by removing redundant weights without the need for retraining. Yet, the massive scale of LLMs often forces current pruning approaches to rely on heuristics instead of optimization-based techniques, potentially resulting in suboptimal compression.
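
To make the heuristics-versus-optimization contrast concrete, optimization-based one-shot pruners typically work with a layer-wise reconstruction objective. The sketch below states that objective and solves it with naive projected gradient descent (iterative hard thresholding); this stands in for illustration only and is not ALPS's actual solver, and the calibration data, iteration count, and step size are my assumptions:

```python
import numpy as np

def layerwise_prune(X, W0, density, iters=100):
    """Generic layer-wise reconstruction objective for one-shot pruning:
        min_W ||X @ W - X @ W0||_F^2   s.t.   nnz(W) <= density * W.size
    X: (n_samples, d_in) calibration activations; W0: (d_in, d_out) dense weights.
    Solved here with projected gradient (iterative hard thresholding)."""
    target = X @ W0
    W = W0.copy()
    k = max(1, int(density * W.size))
    lr = 1.0 / (np.linalg.norm(X, ord=2) ** 2)   # safe step for the quadratic
    for _ in range(iters):
        W -= lr * (X.T @ (X @ W - target))        # gradient step
        flat = np.abs(W).ravel()
        thresh = np.partition(flat, -k)[-k]
        W[np.abs(W) < thresh] = 0.0               # keep the k largest entries
    return W
```

A pure heuristic like one-shot magnitude pruning corresponds to skipping the gradient steps entirely and applying only the thresholding once, which is exactly the gap the optimization-based view tries to close.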